
    Towards linked data for Wikidata revisions and Twitter trending hashtags

    This paper uses Twitter, a microblogging platform, to link hashtags, which relate a message to a topic shared among users, to Wikidata, a central knowledge base that relies on its members and machine bots to keep its content up to date. Wikidata stores its data in a highly structured format and exposes a SPARQL Protocol and RDF Query Language (SPARQL) endpoint that allows users to query its knowledge base. Our research designs and implements a process to stream live tweets and to parse the revision XML files provided by Wikidata. Furthermore, we investigate whether a correlation exists between the top Twitter hashtags and Wikidata revisions over a seventy-seven-day period. We use statistical evaluation tools, such as the Jaccard ratio and the Kolmogorov-Smirnov test, to test for a significant statistical correlation between Twitter hashtags and Wikidata revisions over the studied period.
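
    As a rough illustration of the comparison step only (not the authors' code; the hashtag sets, item labels, and daily counts below are invented placeholders), the sketch computes a Jaccard ratio between one day's top hashtags and the labels of revised Wikidata items, and runs a two-sample Kolmogorov-Smirnov test over two daily activity series with SciPy.

        import scipy.stats as stats

        def jaccard_ratio(a, b):
            # |A ∩ B| / |A ∪ B| for two sets of lower-cased terms
            a, b = set(a), set(b)
            return len(a & b) / len(a | b) if a | b else 0.0

        # Hypothetical daily data: top hashtags vs. labels of revised Wikidata items
        hashtags = {"eurovision", "election", "nba"}
        revised_items = {"eurovision", "olympics", "election"}
        print("Jaccard ratio:", jaccard_ratio(hashtags, revised_items))

        # Hypothetical daily activity counts over part of the study period
        hashtag_counts = [120, 98, 143, 110, 87]
        revision_counts = [3400, 2900, 3700, 3100, 2800]
        ks_stat, p_value = stats.ks_2samp(hashtag_counts, revision_counts)
        print("KS statistic:", ks_stat, "p-value:", p_value)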

    Linked Data Quality Assessment: A Survey

    Data is of high quality if it is fit for its intended use in operations, decision-making, and planning. A colossal amount of linked data is available on the web; however, defects in the data make it difficult to judge how well it fits a given modeling task. Faults that emerge in linked data spread far and wide, affecting all the services built on it. Addressing linked data quality deficiencies requires identifying quality problems, assessing quality, and refining the data to improve its quality. This study aims to identify existing end-to-end frameworks for quality assessment and improvement of data quality. One important finding is that most of the work deals with only one aspect rather than a combined approach. Another finding is that most frameworks aim at solving problems related to DBpedia. Therefore, a standard scalable system is required that integrates the identification of quality issues, their evaluation, and the improvement of linked data quality. This survey contributes to understanding the state of the art of data quality evaluation and improvement. An ontology-based solution is also proposed to build an end-to-end system that analyzes the root causes of quality violations.
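
    As a minimal, illustrative example of one task-independent quality check (not taken from the survey itself; the triples and namespace are invented), the sketch below uses rdflib to measure label completeness, i.e. the fraction of subjects in a graph that carry an rdfs:label.

        from rdflib import Graph, Namespace, URIRef, Literal
        from rdflib.namespace import RDFS

        EX = Namespace("http://example.org/")

        # Build a small in-memory graph with invented triples
        g = Graph()
        g.add((EX.alice, RDFS.label, Literal("Alice")))
        g.add((EX.alice, EX.knows, EX.bob))
        g.add((EX.bob, EX.knows, EX.alice))  # bob has no label

        subjects = {s for s in g.subjects() if isinstance(s, URIRef)}
        labelled = {s for s in subjects if (s, RDFS.label, None) in g}

        # Completeness score: share of subjects carrying an rdfs:label
        print(f"label completeness: {len(labelled) / len(subjects):.2f}")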

    Interrupting The Propaganda Supply Chain

    In this early-stage research, a multidisciplinary approach is presented for the detection of propaganda in the media and for modeling the spread of propaganda and disinformation using the semantic web and graph theory. An ontology will be designed with theoretical underpinnings from multiple disciplines, including the social sciences and epidemiology. An additional objective of this work is to automate triple extraction from unstructured text at a level that surpasses state-of-the-art performance.
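
    The triple-extraction goal can be pictured with a naive subject-verb-object heuristic over dependency parses; this is only a hedged sketch using spaCy, not the extraction method pursued in the paper, and the example sentences are invented.

        import spacy

        # Requires: python -m spacy download en_core_web_sm
        nlp = spacy.load("en_core_web_sm")

        def extract_triples(text):
            # Naive subject-verb-object extraction from dependency parses
            triples = []
            for sent in nlp(text).sents:
                for token in sent:
                    if token.pos_ == "VERB":
                        subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
                        objects = [c for c in token.children if c.dep_ in ("dobj", "attr", "pobj")]
                        for s in subjects:
                            for o in objects:
                                triples.append((s.text, token.lemma_, o.text))
            return triples

        print(extract_triples("The outlet spreads disinformation. Readers share the article."))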

    Extending R2RML-F to support dynamic datatype and language tags

    Linked data is often generated from raw data with the help of mapping languages. Complex data transformation is an essential part of uplifting data and can either be implemented as a custom solution or separated from the mapping process. In this paper, we propose an approach that separates complex data transformations from the mapping process while keeping them reusable across systems. In the proposed method, the complex data transformations include entailing (i) the language tag and (ii) the datatype present in the data source, as well as inferring missing datatype information. We extended R2RML-F to handle these data transformations. The results show that transformation functions can be used to create typed literals dynamically. Our approach is validated on the test cases specified by the RDF Mapping Language (RML). The proposed method considers data in the form of JSON, making the system interoperable and reusable.
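
    To illustrate the idea of deriving a datatype dynamically from the source value rather than hard-coding it in the mapping, the sketch below is a simplified stand-in (not the R2RML-F extension itself; the JSON record and namespace are invented): it infers an XSD datatype from JSON values with rdflib and emits typed literals.

        import json
        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import XSD

        EX = Namespace("http://example.org/")

        def infer_datatype(value):
            # Map native JSON types to XSD datatypes; default to xsd:string
            if isinstance(value, bool):
                return XSD.boolean
            if isinstance(value, int):
                return XSD.integer
            if isinstance(value, float):
                return XSD.double
            return XSD.string

        record = json.loads('{"id": "p1", "height": 1.82, "age": 34, "active": true}')

        g = Graph()
        subject = EX[record["id"]]
        for key, value in record.items():
            if key == "id":
                continue
            g.add((subject, EX[key], Literal(value, datatype=infer_datatype(value))))

        print(g.serialize(format="turtle"))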

    (Linked) Data Quality Assessment: An Ontological Approach

    The effective functioning of data-intensive applications usually requires the dataset to be of high quality. Quality depends on the task the data will be used for. However, it is possible to identify task-independent data quality dimensions that relate solely to the data themselves and can be extracted with the help of rule mining/pattern mining. In order to assess and improve data quality, we propose an ontological approach to report triples that violate data quality requirements. Our goal is to provide data stakeholders with a set of methods and techniques to guide them in assessing and improving data quality.
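
    One way to picture reporting quality-violating triples (a hedged sketch, not the ontological approach proposed in the paper; the data, namespace, and rule are invented) is a SPARQL query over an rdflib graph that flags values breaking a simple task-independent constraint.

        from rdflib import Graph, Literal, Namespace

        EX = Namespace("http://example.org/")

        g = Graph()
        g.add((EX.alice, EX.age, Literal(34)))
        g.add((EX.bob, EX.age, Literal(-5)))  # violates the "age must be non-negative" rule

        # Report triples violating the constraint
        violations = g.query("""
            PREFIX ex: <http://example.org/>
            SELECT ?s ?o WHERE {
                ?s ex:age ?o .
                FILTER(?o < 0)
            }
        """)

        for s, o in violations:
            print("quality violation:", s, "has negative age", o)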

    KnowText: Auto-generated Knowledge Graphs for custom domain applications

    While industrial Knowledge Graphs enable information extraction from massive data volumes and form the backbone of the Semantic Web, specialised, custom-designed knowledge graphs focused on enterprise-specific information are an emerging trend. We present “KnowText”, an application that automatically generates custom Knowledge Graphs from unstructured text and enables fast information extraction through graph visualisation and free-text query methods designed for non-specialist users. An OWL ontology automatically extracted from the text is linked to the knowledge graph and used as a knowledge base. A basic ontological schema is provided, including 16 classes and datatype properties. The extracted facts and the OWL ontology can be downloaded and further refined. KnowText is designed for applications in business (CRM, HR, banking). A custom KG can serve for locally managing existing data, often stored as “sensitive” information or proprietary accounts that are not openly accessible on the web. KnowText deploys a custom KG from a collection of text documents and enables fast information extraction based on its graph-based visualisation and text-based query methods.
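
    The general workflow, building a small graph from extracted (subject, predicate, object) facts and then querying it, can be sketched with rdflib as below; the facts, namespace, and lookup are illustrative placeholders, not KnowText's actual schema or its 16-class ontology.

        from rdflib import Graph, Namespace, Literal

        EX = Namespace("http://example.org/kg/")

        # Facts as they might come out of a text-extraction step (invented examples)
        facts = [
            ("Acme", "hasCEO", "Jane Doe"),
            ("Acme", "locatedIn", "Dublin"),
            ("Jane Doe", "worksFor", "Acme"),
        ]

        g = Graph()
        for subj, pred, obj in facts:
            g.add((EX[subj.replace(" ", "_")], EX[pred], Literal(obj)))

        # Free-text-style lookup: everything known about "Acme"
        for p, o in g.predicate_objects(EX["Acme"]):
            print("Acme", p.split("/")[-1], o)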

    Comparing tagging suggestion models on discrete corpora

    This paper investigates methods for predicting tags on textual corpora of short messages describing diverse data sets; as examples, the authors demonstrate the methods on hotel staff inputs in a ticketing system as well as on the publicly available StackOverflow corpus. The aim is to improve the tagging process and to find the most suitable method for suggesting tags for a new text entry.
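
    A baseline tag-suggestion model of the kind compared in such studies can be built with scikit-learn as a TF-IDF representation feeding a one-vs-rest multi-label classifier; this is a hedged sketch, not the paper's specific models, and the training snippets, tags, and threshold are invented.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.multiclass import OneVsRestClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import MultiLabelBinarizer

        # Tiny invented training set of short messages and their tags
        texts = [
            "wifi not working in room 204",
            "air conditioning too loud",
            "how to parse json in python",
            "python list comprehension syntax error",
        ]
        tags = [["wifi", "maintenance"], ["maintenance"], ["python", "json"], ["python"]]

        mlb = MultiLabelBinarizer()
        y = mlb.fit_transform(tags)

        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2)),
            OneVsRestClassifier(LogisticRegression(max_iter=1000)),
        )
        model.fit(texts, y)

        # Suggest tags for a new entry using a probability threshold
        probs = model.predict_proba(["python json decode error"])[0]
        suggested = [tag for tag, p in zip(mlb.classes_, probs) if p > 0.3]
        print(suggested)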

    Sphaerospora molnari (Myxozoa) in carp fry

    Gill sphaerosporosis is a fish disease caused by the parasite Sphaerospora molnari, which attacks the gills and skin. The first occurrence of gill sphaerosporosis in carp fry was recorded in Hungary as early as 1972, then in the Czech Republic and Poland, while in our country the disease has been present since the mid-1980s. Molnar, who was the first to study the pathogenic effect of this agent, initially identified it as Sphaerospora carassi. The Czech researchers Lom and Dykova detected the causative agent of gill sphaerosporosis in gill material from diseased carp fry using histological tissue sections and proposed that the parasite be named Sphaerospora molnari. The disease is quite common in fry of pond-reared common carp and grass carp, and the intensity of infestation can reach up to 100%. The aim of this study was to determine the presence of the disease caused by Sphaerospora molnari and to follow the clinical and histopathological changes in infested carp fry. The research was conducted on 18 carp farms in Serbia from 2008 to 2012, as part of systematic monitoring of the most important protozoan diseases of carp. Carp fry were examined throughout the entire growing season. Clinical changes were monitored, and samples were taken for native (wet-mount) microscopy performed with a light microscope. Gill tissue was taken from infected individuals for histopathological analysis, carried out by the standard methodology: fixation in 10% formalin, embedding in paraffin, cutting of 5 μm sections, and staining with H&E. The presence of S. molnari was established in carp fry aged 20 days to 3 months. Developmental stages and mature spores were present on the gills and could be observed in the stratified epithelium of the gill filaments. The spores invaded the epithelium and formed large clusters. Accumulations of developmental stages and mature spores were also present in the double layer of epithelial cells covering the secondary lamellae, most often between the inner and outer layers, causing distension of the tissue. Infected lamellae undergo necrosis, which leads to the outward movement of the spores. The spores measured 10 x 10 μm. Clinically, the disease manifested as whitish deposits on the gills resulting from the aggregation of parasites, which can occupy up to 80% of the surface of the stratified epithelium, covering the gill plates and arches. The pressure of the multiplying parasites deforms the cells of the tissue, and ultimately the cytoplasm of the gill epithelial cells becomes thinned into a net-like pattern. Since the spores cover most of the respiratory epithelium, they reduce the resistance of the organism and create conditions for the development of other disease agents (primarily trematodes), which classifies Sphaerospora molnari as a pathogenic parasite. The localization and size of the spores and developmental stages of S. molnari, as well as the clinical and histopathological changes recorded during this study, correspond to the results described by other researchers working on this problem. Since no adequate therapeutic agent exists, control of sphaerosporosis still relies on adherence to basic sanitary and prophylactic measures, such as drying out the ponds, winter freezing, mechanical treatment of the soil, and disinfection with lime.

    Particle Swarm Optimization of Convolutional Neural Networks for Human Activity Prediction

    The increased usage of smartphones for daily activities has created a huge demand and many opportunities in the field of ubiquitous computing to provide personalized services and support to the user. In this context, sensor-based Human Activity Recognition (HAR) has seen immense growth in the last decade, playing a major role in pervasive computing by detecting the activity performed by the user. Accurate prediction of user activity can be a valuable input to several applications such as health monitoring systems, wellness and fitness tracking, and emergency communication systems. The current research therefore performs Human Activity Recognition using a Particle Swarm Optimization (PSO) based Convolutional Neural Network that converges faster and searches for the best CNN architecture. Using PSO in the training process aims to optimize the solution vectors of the CNN, which in turn improves the classification accuracy so that it reaches the quality of state-of-the-art designs. The study investigates the performance of the PSO-CNN algorithm and compares it with classical machine learning and deep learning algorithms. The experimental results show that the PSO-CNN algorithm achieved performance almost equal to state-of-the-art designs, with an accuracy of 93.64%. Among the machine learning algorithms, the Support Vector Machine was found to be the best classifier with an accuracy of 95.05%, and a deep CNN model achieved an accuracy score of 92.64%.
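
    The search idea can be sketched as a plain PSO loop over CNN hyperparameters (for example, number of filters and dense units); this is a generic illustration, not the architecture or optimization scheme from the paper, and the fitness function below is an invented placeholder standing in for "train the CNN and return validation accuracy".

        import numpy as np

        rng = np.random.default_rng(0)

        def fitness(position):
            # Placeholder for: build a CNN with these hyperparameters, train it,
            # and return validation accuracy. Here: a dummy score peaking at
            # filters=64, dense_units=128 so the sketch runs without a GPU.
            filters, dense_units = position
            return -((filters - 64) ** 2 + (dense_units - 128) ** 2)

        n_particles, n_iters, dim = 10, 30, 2
        lower, upper = np.array([8, 16]), np.array([128, 512])  # hyperparameter bounds

        pos = rng.uniform(lower, upper, size=(n_particles, dim))
        vel = np.zeros((n_particles, dim))
        pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_val.argmax()].copy()

        w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
        for _ in range(n_iters):
            r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, lower, upper)
            vals = np.array([fitness(p) for p in pos])
            improved = vals > pbest_val
            pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
            gbest = pbest[pbest_val.argmax()].copy()

        print("best hyperparameters (filters, dense_units):", np.round(gbest).astype(int))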